CiteSeerX

arXiv.org e-Print Archive

Solving Maximum Clique Problem for Protein Structure Similarity

Author: Andonov Rumen
Malod-Dognin Noël
Yanev Nicola
Publication date: 01/01/2009

A basic assumption of molecular biology is that proteins sharing close three-dimensional (3D) structures are likely to share a common function and, in most cases, derive from a common ancestor. Computing the similarity between two protein structures is therefore a crucial task and has been extensively investigated. The similarity of two proteins can be evaluated by finding an optimal one-to-one matching between their components, which is equivalent to identifying a maximum weighted clique in a specific "alignment graph". In this paper we present a new integer programming formulation for solving such clique problems. The model has been implemented using the ILOG CPLEX Callable Library. In addition, we designed a dedicated branch and bound algorithm for solving the maximum cardinality clique problem. Both approaches have been integrated into VAST (Vector Alignment Search Tool), a software tool for aligning protein 3D structures that is widely used at NCBI (National Center for Biotechnology Information). The original VAST clique solver uses the well-known Bron and Kerbosch algorithm (BK). Our computational results on real-life protein alignment instances show that our branch and bound algorithm is up to 116 times faster than BK for the largest proteins.
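For readers unfamiliar with the BK baseline named in the abstract, the following is a minimal sketch of the textbook Bron and Kerbosch recursion with pivoting, used here to pick out one maximum-cardinality clique. It is not the VAST solver and not the authors' branch and bound algorithm; the function names `graph_from_edges` and `max_clique` and the toy graph are assumptions made for illustration.

```python
def graph_from_edges(edges):
    """Build an adjacency map {vertex: set of neighbours} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def max_clique(adj):
    """Return one maximum-cardinality clique (sorted list of vertices).

    Enumerates all maximal cliques with the Bron-Kerbosch recursion
    (pivoting variant) and keeps the largest one seen.
    """
    best = []

    def expand(r, p, x):
        # r: current clique, p: candidates that extend r, x: already explored
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates,
        # so that fewer branches have to be explored.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)


# Toy graph (hypothetical): a 4-clique {1, 2, 3, 4} with two pendant vertices.
toy_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (0, 1), (4, 5)]
print(max_clique(graph_from_edges(toy_edges)))
```

In an alignment-graph setting, each vertex would stand for a candidate pairing of components from the two proteins and each edge for a compatible pair of pairings; that construction is not shown here.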

Bulgarian Digital Mathematics Library at IMI-BAS

N–Dimensional Orthogonal Tile Sizing Problem

Author: Andonov Rumen
Yanev Nicola
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/1998

AMS subject classification: 68Q22, 90C90

We discuss in this paper the problem of generating highly efficient code when an (n + 1)-dimensional nested loop program is executed on an n-dimensional torus/grid of distributed-memory general-purpose machines. We focus on a class of uniform recurrences with non-negative components of the dependency matrix. Using a strategy of tiling the iteration space, we show that minimizing the total running time reduces to solving a non-trivial non-linear integer optimization problem. For the latter we present a mathematical framework that enables us to derive an O(n log n) algorithm for finding a good approximate solution. The theoretical evaluations and the experimental results show that the obtained solution approximates the original minimum sufficiently well in the context of the considered problem. Such an algorithm is usable in real time for very large values of n and can be used as an optimization technique in parallelizing compilers as well as in performance tuning of parallel codes by hand.

Lagrangian Approaches for a class of Matching Problems in Computational Biology

Author: Andonov Rumen
Balev Stefan
Veber Philippe
Yanev Nicola
Publication date: 01/01/2006

This paper presents efficient algorithms for solving the problem of aligning a protein structure template to a query amino-acid sequence, known as the protein threading problem. We consider the problem as a special case of the graph matching problem. We give formal graph and integer programming models of the problem. After studying the properties of these models, we propose two kinds of Lagrangian relaxation for solving them. We present experimental results on real-life instances showing the efficiency of our approaches.
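As background, the generic form of a Lagrangian relaxation can be written as follows. This is the textbook construction only, not either of the two relaxations proposed in the paper; the symbols $c$, $A$, $b$, $X$ and $\lambda$ are assumptions introduced for illustration.

```latex
% Integer program with "complicating" constraints Ax <= b
z^{*} = \max \{\, c^{\top} x \;:\; A x \le b,\; x \in X \,\}

% Moving Ax <= b into the objective with multipliers \lambda \ge 0
L(\lambda) = \max \{\, c^{\top} x + \lambda^{\top} (b - A x) \;:\; x \in X \,\}

% Every \lambda \ge 0 gives an upper bound; the Lagrangian dual seeks the tightest one
z^{*} \le L(\lambda) \quad \text{for all } \lambda \ge 0,
\qquad
z_{LD} = \min_{\lambda \ge 0} L(\lambda)
```

The bound holds because any $x$ feasible for the original problem satisfies $b - Ax \ge 0$, so the added term is non-negative. The relaxation is useful when maximizing over $X$ alone is much easier than solving the full problem.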

arXiv.org e-Print Archive

HAL - Normandie Université

Flexible Alignments for Protein Threading

Author: Andonov Rumen
Collet Guillaume
Gibrat Jean-François
Yanev Nicola
Publication venue: HAL CCSD
Publication date: 01/01/2009

We present a new local alignment method for the protein threading problem. Local sequence-sequence alignments are widely used to find functionally important regions in families of proteins. However, to the best of our knowledge, no local sequence-structure alignment algorithm has been described in the literature. Here we model local alignments as Mixed Integer Programming (MIP) models. These models make it possible to align a part of a protein structure onto a protein sequence in order to detect local similarities. The paper describes two MIP models and compares and analyzes their performance using the ILOG CPLEX 10 solver.

HAL Descartes

Comparing Protein 3D Structures Using A_purva

Author: Andonov Rumen
Malod-Dognin Noël
Yanev Nicola
Publication venue: HAL CCSD
Publication date: 25/11/2010

Structural similarity between proteins provides significant insights about their functions. Contact Map Overlap maximization (CMO) has received sustained attention during the past decade and can be considered today as a credible protein structure similarity measure. We present here A_purva, an exact CMO solver that is both efficient (notably faster than the previous exact algorithms) and reliable (providing accurate upper and lower bounds of the solution). These properties make it applicable for large-scale protein comparison and classification. Availability: http://apurva.genouest.org Contact: [email protected] Supplementary information: A_purva's user manual, as well as many examples of protein contact maps, can be found on A_purva's web page.

Modèle de PLNE pour la recherche de cliques de poids maximal (An integer linear programming model for finding maximum weight cliques)

Author: Andonov Rumen
Gibrat Jean-François
Malod-Dognin Noël
Yanev Nicola
Publication venue: HAL CCSD
Publication date: 25/02/2008

Estimating the similarity of two protein structures is a very important task in biology. It is usually based on an alignment, i.e. a one-to-one matching between the amino acids of each protein. Among the methods for aligning proteins, we are interested in VAST, which first aligns the secondary structure elements (SSEs) and then extends this alignment to the amino acids. The SSE alignment is formulated as a maximum clique problem in a particular graph. In this paper we propose a new integer programming model for various maximum weight clique problems and apply it successfully in VAST.

The Protein Threading Problem is in P?

Author: Andonov Rumen
Yanev Nicola
Publication venue: HAL CCSD
Publication date: 01/01/2002

This work is about a problem from computational biology known as the protein threading problem. By finding an appropriate linear mixed-integer programming (MIP) formulation, we demonstrate that real-life instances of this problem can be solved efficiently using only a linear programming (LP) solver instead of a special-purpose branch and bound algorithm. This is because, within the proposed MIP model, all biological instances we were able to test attain their optima at feasible vertices of the underlying LP polytope, which is the essence of the statement in the title.

A Novel Algorithm for Finding Maximum Common Ordered Subgraph

Author: Andonov Rumen
Yanev Nicola
Publication venue: HAL CCSD
Publication date: 01/01/2007

In this paper, we study the following problem: given the adjacency matrices of two simple graphs, find two principal submatrices, one from each, that have the maximum inner product when viewed as vectors. When used for computing the similarity of two protein structures, this problem is called contact map overlap, and for the latter we give an exact B&B algorithm with bounds computed by solving a Lagrangian relaxation of the problem. The efficiency of the approach is demonstrated on a popular benchmark set of instances, together with a comparison with the best existing algorithm.
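To make the objective concrete, here is a small sketch that scores an order-preserving alignment between two contact maps and finds the best one by exhaustive search. It only illustrates what is being maximized; it is not the paper's B&B algorithm and is usable only for tiny inputs. The function names and the toy contact maps are assumptions made for illustration.

```python
from itertools import combinations


def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that are mapped onto contacts of B.

    contacts_a, contacts_b: iterables of residue index pairs (i, j).
    alignment: dict mapping a residue of A to a residue of B.
    """
    b = {frozenset(edge) for edge in contacts_b}
    return sum(
        1
        for i, j in contacts_a
        if i in alignment
        and j in alignment
        and frozenset((alignment[i], alignment[j])) in b
    )


def best_overlap_bruteforce(n_a, n_b, contacts_a, contacts_b):
    """Try every order-preserving alignment; exponential, tiny inputs only."""
    best_value, best_alignment = 0, {}
    for k in range(1, min(n_a, n_b) + 1):
        # combinations() yields increasing tuples, so pairing them position
        # by position always preserves residue order.
        for rows in combinations(range(n_a), k):
            for cols in combinations(range(n_b), k):
                alignment = dict(zip(rows, cols))
                value = overlap(contacts_a, contacts_b, alignment)
                if value > best_value:
                    best_value, best_alignment = value, alignment
    return best_value, best_alignment


# Hypothetical toy contact maps: A has 4 residues, B has 5.
contacts_a = [(0, 2), (1, 3), (0, 3)]
contacts_b = [(0, 3), (1, 4), (0, 4), (2, 3)]
value, alignment = best_overlap_bruteforce(4, 5, contacts_a, contacts_b)
print(value, alignment)
```

Choosing the aligned residues in each protein corresponds to choosing a principal submatrix of each adjacency matrix; for symmetric 0/1 matrices the inner product of the two submatrices is twice the overlap counted here.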